Abstract

We consider a family of divergence-based minimax approaches to perform robust filtering. The 
mismodeling budget, or tolerance, is specified at each time increment of the model. More precisely, all 
possible model increments belong to a ball which is formed by placing a bound on the Tau-divergence 
family between the actual and the nominal model increment. Then, the robust filter is obtained by 
minimizing the mean square error according to the least favorable model in that ball. It turns out that 
the solution is a family of Kalman like filters. Their gain matrix is updated according to a risk sensitive 
like iteration where the risk sensitivity parameter is now time varying. As a consequence, we also extend 
the risk sensitive filter to a family of risk sensitive like filters according to the Tau-divergence family. 

Index Terms 

Robust Kalman filtering, Tau-divergence family, minimax problem, risk sensitive filtering. 

I. Introduction 

The Kalman filter is ubiquitous in many applications. The main reason is its iterative
structure, which makes its implementation very simple. On the other hand, this filter is designed
with respect to a linear state space model. The latter is often inadequate to describe the phenomenon
of interest, and accordingly the resulting Kalman filter does not perform well in practice. The
importance of developing robust versions of the standard Kalman filter has therefore been clear since the beginning.

Robust filtering can be performed according to the risk sensitive approach, [14], [13], [1],
[?], [10]. Here, the robust estimator is designed according to the nominal model but in
such a way as to avoid large errors. The sensitivity to large errors is tuned by the so-called risk
sensitivity parameter. It is worth noting that this approach has been interpreted as a minimax problem,
[?]. The appealing aspect of the risk sensitive approach is that the solution is
a Kalman-like filter. On the other hand, the risk sensitivity parameter is not explicitly connected
to the discrepancy between the actual and the nominal model. Recently, a divergence-based
minimax approach has been proposed in [?], [?], [20]. More precisely, in [?] the robust static
estimation problem of a signal given noisy observations has been considered. Here, all possible
models belong to a ball which is formed by placing a bound on the Kullback-Leibler divergence
between the actual and the nominal model. This bound, called the tolerance, represents the mismodeling
budget. Then, the robust estimator is obtained by minimizing the mean square error according to
the least favorable model in this ball. It turns out that the Bayes estimator is robust under
model uncertainty characterized by this ball. In [9], a dynamic extension of this problem (i.e.
a robust filtering problem) has been considered. More precisely, drawing inspiration from [?],
[?], the mismodeling budget is specified at each time increment of the model, that is, the model
uncertainty is expressed in an incremental way. Roughly speaking, the idea is to iterate the Bayes
estimator with the least favorable statistics found in [?]. It turns out that the robust estimator has a
Kalman-like structure. More precisely, it is a risk sensitive like filter, where the risk sensitivity
parameter is now time varying.

In [19], the robust static estimation problem proposed in [?] has been extended, in the Gaussian
case, to a family of uncertainty classes. The latter is formed by placing a bound on a set of
divergences (called the $\tau$-divergence family, [19]) between the actual and the nominal model. This
particular divergence family is chosen because, in contrast to the alpha and the beta families, [18],
[?], it allows one to characterize uncertainty balls for which the Bayes estimator is still robust.

The contribution of this paper is to extend the robust Kalman filter in [9] to a family of robust
Kalman filters parametrized by the $\tau$-divergence family, using the results in [19]. This family
of filters is characterized by a time-varying risk sensitivity parameter. Therefore, by adopting the
perspective given in [?], we also extend the risk sensitive filter to a family of risk sensitive like
filters parametrized by the $\tau$-divergence family, say $\tau$-risk sensitive filters. Finally, we present a
simulation study which shows that the parameter $\tau$ tunes how conservative the robust filter is.

In the paper we will use the following notation. $\|x\|$ denotes the Euclidean norm of $x \in \mathbb{R}^n$.
$\|x\|_A$ denotes the weighted Euclidean norm of $x$, that is $\|x\|_A^2 = x^T A x$, with $A$ symmetric and positive definite.

II. Robust Static Estimation


We review the robust static estimation problem under model uncertainty characterized by the
$\tau$-divergence family introduced in [19]. Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^p$ be two jointly Gaussian random
vectors. Let $z := [\, x^T \;\; y^T \,]^T$. Its joint nominal probability density $f$ is

\[
f(z) = \frac{1}{\sqrt{(2\pi)^{p+n} \det K_z}} \exp\left( -\frac{1}{2} (z - m_z)^T K_z^{-1} (z - m_z) \right)
\]

where the mean vector $m_z \in \mathbb{R}^{n+p}$ and the covariance matrix $K_z$, symmetric and positive definite of dimension $n+p$, are known. We
conformably partition the mean vector and the covariance matrix of $z$ according to $x$ and $y$:

\[
m_z = \begin{bmatrix} m_x \\ m_y \end{bmatrix}, \qquad
K_z = \begin{bmatrix} K_x & K_{xy} \\ K_{yx} & K_y \end{bmatrix}.
\]

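Let $\tilde f$ denote the actual joint probability density of $z$

\[
\tilde f(z) = \frac{1}{\sqrt{(2\pi)^{p+n} \det \tilde K_z}} \exp\left( -\frac{1}{2} (z - \tilde m_z)^T \tilde K_z^{-1} (z - \tilde m_z) \right)
\]

where the mean vector $\tilde m_z \in \mathbb{R}^{n+p}$ and the covariance matrix $\tilde K_z$, symmetric and positive definite of dimension $n+p$, are unknown. Since
both $f$ and $\tilde f$ are Gaussian, the deviation between $f$ and $\tilde f$ may be directly measured by the
deviation between $(m_z, K_z)$ and $(\tilde m_z, \tilde K_z)$ through the $\tau$-divergence, [19]: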

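\[
D_\tau(\tilde f \| f) =
\begin{cases}
\|\Delta m_z\|^2_{K_z^{-1}} + \mathrm{tr}\left( -\log(\tilde K_z K_z^{-1}) + \tilde K_z K_z^{-1} - I_{n+p} \right), & \tau = 0 \\[4pt]
\|\Delta m_z\|^2_{\frac{1}{1-\tau} K_z^{-1}} + \mathrm{tr}\left( -\frac{1}{\tau(1-\tau)} (L_z^{-1} \tilde K_z L_z^{-T})^{\tau} + \frac{1}{1-\tau} \tilde K_z K_z^{-1} + \frac{1}{\tau} I_{n+p} \right), & 0 < \tau < 1 \\[4pt]
\delta_\infty(\Delta m_z) + \mathrm{tr}\left( L_z^{-1} \tilde K_z L_z^{-T} \log(L_z^{-1} \tilde K_z L_z^{-T}) - \tilde K_z K_z^{-1} + I_{n+p} \right), & \tau = 1
\end{cases}
\tag{1}
\]

where $L_z$ is such that $K_z = L_z L_z^T$, $\Delta m_z = \tilde m_z - m_z$, and

\[
\delta_\infty(v) :=
\begin{cases}
0 & \text{if } v = 0 \\
\infty & \text{otherwise.}
\end{cases}
\]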


Note that $D_\tau(\tilde f \| f) \geq 0$ and equality holds if and only if $\tilde f = f$. This divergence is rooted in
prediction theory. Let $e_N = L_z^{-1}(z - m_z)$ with $z \sim \tilde f$. $e_N$ can be understood as a normalized
prediction error, where $m_z$ represents the minimum variance prediction of $z$ based on $f$. If
$\tilde f = f$, then $e_N$ has zero mean and covariance matrix $I$. Hence, this divergence measures the
discrepancy between $e_N$ and the Gaussian random vector with zero mean and covariance $I$. We
consider the closed ball centered on $f$:

\[
\mathcal{B}_\tau := \{ \tilde f \ \text{s.t.}\ D_\tau(\tilde f \| f) \leq c \} \tag{2}
\]

where $c \in \mathbb{R}_+$ is a fixed tolerance. Accordingly, $\mathcal{B}_\tau$ represents the set of all possible probability
densities of $z$ consistent with the allowed mismodeling budget.
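
As an illustration of (1), the following is a minimal NumPy sketch that evaluates the divergence between two Gaussian densities given their means and covariances. The function names `sym_fun` and `tau_divergence` are hypothetical and not part of the paper; the weight $\frac{1}{1-\tau}$ on the mean term for $0 < \tau < 1$ follows our reading of (1), and matrix functions are computed through an eigendecomposition.

```python
import numpy as np


def sym_fun(M, fun):
    """Apply a scalar function to a symmetric matrix via its eigenvalues."""
    w, U = np.linalg.eigh((M + M.T) / 2.0)
    return (U * fun(w)) @ U.T


def tau_divergence(m_act, K_act, m_nom, K_nom, tau):
    """D_tau between N(m_act, K_act) (actual) and N(m_nom, K_nom) (nominal)."""
    d = K_nom.shape[0]
    L = np.linalg.cholesky(K_nom)                         # K_nom = L L^T
    dm = np.linalg.solve(L, m_act - m_nom)                # L^{-1} (m_act - m_nom)
    X = np.linalg.solve(L, np.linalg.solve(L, K_act).T)   # L^{-1} K_act L^{-T}
    q = float(dm @ dm)                                    # squared norm weighted by K_nom^{-1}
    if tau == 0:
        return q + np.trace(X - sym_fun(X, np.log)) - d
    if tau == 1:
        if q > 0:
            return np.inf                                 # delta_inf of a nonzero mean shift
        return np.trace(X @ sym_fun(X, np.log) - X) + d
    X_tau = sym_fun(X, lambda w: w ** tau)
    return q / (1 - tau) + np.trace(X / (1 - tau) - X_tau / (tau * (1 - tau))) + d / tau
```

A density belongs to the ball (2) when the returned value does not exceed the tolerance $c$; the value is zero when the two densities coincide.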

The robust estimator of $x$ given $y$ is designed according to the minimax point of view, [8], [6].
More precisely, whenever we seek to design an estimator minimizing a suitable loss function,
a hostile player, say “nature”, conspires to select the worst possible probability density in $\mathcal{B}_\tau$.
Let $g(y)$ denote an estimator of $x$ based on the observation vector $y$. The optimal robust estimator
is the solution to the following minimax problem

\[
\min_{g \in \mathcal{G}} \max_{\tilde f \in \mathcal{B}_\tau} J(\tilde f, g) \tag{3}
\]

where

\[
J(\tilde f, g) = E_{\tilde f}[\|x - g(y)\|^2] = \int_{\mathbb{R}^{n+p}} \|x - g(y)\|^2 \tilde f(z)\, dz
\]

denotes the mean square error and $\mathcal{G}$ denotes the set of all estimators $g(y)$ such that $E_{\tilde f}[\|g(y)\|^2]$
is finite for any $\tilde f \in \mathcal{B}_\tau$.

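Theorem 2.1: Let $0 \leq \tau \leq 1$. The least favorable probability density $\tilde f^\circ$ has mean vector
$\tilde m_z^\circ = m_z$ and covariance matrix with the following structure

\[
\tilde K_z^\circ = \begin{bmatrix} \tilde K_x & K_{xy} \\ K_{yx} & K_y \end{bmatrix} \tag{4}
\]

wherein only the covariance of $x$ is perturbed with respect to the nominal covariance matrix.
Let

\[
\begin{aligned}
P &= K_x - K_{xy} K_y^{-1} K_{yx} \\
V &= \tilde K_x - K_{xy} K_y^{-1} K_{yx}
\end{aligned}
\tag{5}
\]

denote the nominal and the perturbed a posteriori covariance matrix of $x$ given $y$. Then,

\[
V =
\begin{cases}
L_P \left( I_n - \theta(1-\tau) L_P^T L_P \right)^{\frac{1}{\tau-1}} L_P^T, & 0 \leq \tau < 1 \\[4pt]
L_P \exp\left( \theta L_P^T L_P \right) L_P^T, & \tau = 1
\end{cases}
\tag{6}
\]

where $L_P$ is such that $P = L_P L_P^T$. Here $\theta > 0$, with $\theta^{-1} > (1-\tau)\|P\|$, is the unique Lagrange
multiplier satisfying the relation $c = \gamma_\tau(P, \theta)$ where

\[
\gamma_\tau(P, \theta) =
\begin{cases}
-\log\det(I_n - \theta P)^{-1} + \mathrm{tr}\left( (I_n - \theta P)^{-1} - I_n \right), & \tau = 0 \\[4pt]
\mathrm{tr}\left( -\frac{1}{\tau(1-\tau)} (I_n - \theta(1-\tau) L_P^T L_P)^{\frac{\tau}{\tau-1}} + \frac{1}{1-\tau} (I_n - \theta(1-\tau) L_P^T L_P)^{\frac{1}{\tau-1}} + \frac{1}{\tau} I_n \right), & 0 < \tau < 1 \\[4pt]
\mathrm{tr}\left( \exp(\theta L_P^T L_P)(\theta L_P^T L_P - I_n) + I_n \right), & \tau = 1.
\end{cases}
\tag{7}
\]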

The optimal robust estimator is the Bayes estimator

\[
g^\circ(y) = G^\circ (y - m_y) + m_x \tag{8}
\]

with $G^\circ = K_{xy} K_y^{-1}$.

Theorem 2.1 shows that the Bayes estimator is robust with respect to the uncertainty class,
parametrized by $\tau$, in (2). Clearly, this optimality holds in the Gaussian case. Without this
assumption, the least favorable probability density could differ substantially from the one in
Theorem 2.1.
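
To make the relation between $c$, $\theta$ and $V$ concrete, here is a minimal NumPy sketch under our own assumptions. It uses the fact that $L_P^T L_P$ and $P$ share the same eigenvalues, so $\gamma_\tau$ in (7) depends only on the eigenvalues $w_i$ of $P$, and that (6) reduces to $V = W \,\mathrm{diag}(w_i\, g(w_i))\, W^T$ when $P = W \,\mathrm{diag}(w_i)\, W^T$. The function names and the use of bisection to solve $c = \gamma_\tau(P, \theta)$ are illustrative choices; the paper does not prescribe a root-finding method.

```python
import numpy as np


def gamma_tau(P, theta, tau):
    """gamma_tau(P, theta) of (7), evaluated through the eigenvalues of P."""
    w = np.linalg.eigvalsh(P)            # same eigenvalues as L_P^T L_P
    if tau == 1:
        return float(np.sum(np.exp(theta * w) * (theta * w - 1.0) + 1.0))
    s = 1.0 - theta * (1.0 - tau) * w    # must stay positive
    if tau == 0:
        return float(np.sum(np.log(s) + 1.0 / s - 1.0))
    a, b = tau / (tau - 1.0), 1.0 / (tau - 1.0)
    return float(np.sum(-s ** a / (tau * (1 - tau)) + s ** b / (1 - tau) + 1.0 / tau))


def solve_theta(P, c, tau, iters=200):
    """Bisection for the theta > 0 such that gamma_tau(P, theta) = c."""
    lo = 0.0
    if tau < 1:
        hi = 1.0 / ((1.0 - tau) * np.linalg.eigvalsh(P).max())  # open upper bound
    else:
        hi = 1.0
        while gamma_tau(P, hi, tau) < c:                         # bracket the root
            hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gamma_tau(P, mid, tau) < c:
            lo = mid
        else:
            hi = mid
    return lo


def perturbed_cov(P, theta, tau):
    """V of (6), computed as W diag(w g(w)) W^T with P = W diag(w) W^T."""
    w, W = np.linalg.eigh(P)
    if tau == 1:
        g = np.exp(theta * w)
    else:
        g = (1.0 - theta * (1.0 - tau) * w) ** (1.0 / (tau - 1.0))
    return (W * (w * g)) @ W.T
```

For $\tau = 0$ the result can be checked against the closed form $V = (P^{-1} - \theta I_n)^{-1}$, which follows directly from (6).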

Corollary 2.1: Let $\theta > 0$ be a priori fixed and such that $\theta^{-1} > (1-\tau)\|P\|$. Consider the
minimax problem

\[
\min_{g \in \mathcal{G}} \max_{\tilde f \in \tilde{\mathcal{B}}_\tau} E_{\tilde f}[\|x - g(y)\|^2] - \theta^{-1} D_\tau(\tilde f \| f)
\]

where $\tilde{\mathcal{B}}_\tau = \{ \tilde f \ \text{s.t.}\ D_\tau(\tilde f \| f) < \infty \}$ and $\mathcal{G}$ is the set of all estimators such that $E_{\tilde f}[\|g(y)\|^2]$ is
finite for any $\tilde f \in \tilde{\mathcal{B}}_\tau$. Then, the least favorable probability density $\tilde f^\circ$ has mean vector $\tilde m_z^\circ = m_z$
and covariance matrix $\tilde K_z^\circ$ as in (4). The perturbed a posteriori covariance matrix $V$ of $x$ given
$y$ is as in (5). Moreover, its relation with $P$ is given by (6), where $\theta$ has now been chosen a priori.
The optimal estimator is the Bayes estimator (8).


III. Robust Filtering Problem
We consider a nominal Gauss-Markov state space model of the form

\[
\begin{aligned}
x_{t+1} &= A_t x_t + B_t v_t \\
y_t &= C_t x_t + D_t v_t
\end{aligned}
\tag{9}
\]

where $x_t \in \mathbb{R}^n$ is the state process, $y_t \in \mathbb{R}^p$ is the observation process, and $v_t \in \mathbb{R}^m$ is WGN
with unit variance, i.e. $E[v_t v_s^T] = I_m \delta_{t-s}$ where $\delta_t$ denotes the Kronecker delta function. We
assume that the noise $v_t$ is independent of the initial state, whose nominal distribution is given
by $f_0(x_0) \sim \mathcal{N}(\hat x_0, V_0)$. Let $z_t = [\, x_{t+1}^T \;\; y_t^T \,]^T$. Model (9) is characterized by the nominal
transition probability density of $z_t$ given $x_t$:

\[
\phi_t(z_t | x_t) \sim \mathcal{N}\left( \begin{bmatrix} A_t \\ C_t \end{bmatrix} x_t,\ \begin{bmatrix} B_t \\ D_t \end{bmatrix} \begin{bmatrix} B_t^T & D_t^T \end{bmatrix} \right). \tag{10}
\]


As noticed in [?], when entropy-like indexes are used to measure the proximity of statistical
models, all the relations between dynamic variables or observations should be uncertain, otherwise
those indexes take infinite value. To avoid such a situation, we assume that the noise $v_t$
affects all the components of the dynamics and observations in (9), possibly with a very small
variance for relations which are viewed as almost certain. Therefore, the covariance matrix

\[
K_{z_t | x_t} = \begin{bmatrix} B_t \\ D_t \end{bmatrix} \begin{bmatrix} B_t^T & D_t^T \end{bmatrix}
\]

is positive definite. Moreover, the matrix $\Gamma_t = [\, B_t^T \;\; D_t^T \,]^T$ has full row rank, and without
loss of generality we can assume $\Gamma_t$ is square and invertible, so that $m = n + p$. Otherwise, we
can compress the column space of $\Gamma_t$ and remove noise components which do not affect model
(9).

We adopt the minimax approach proposed in [9, Section III] to characterize the robust filter. Let
$\tilde\phi_t(z_t | x_t)$ be the least favorable transition probability density of $z_t$ given $x_t$. Let $\tilde f_t(x_t | Y_{t-1})$ be the
a priori probability density of $x_t$ conditioned on the observations $Y_{t-1} = \{ y_s,\ 0 \leq s \leq t-1 \}$
and based on the least favorable model. We introduce the marginal probability densities

\[
\bar f_t(z_t | Y_{t-1}) = \int \phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})\, dx_t \tag{11}
\]

\[
\tilde f_t(z_t | Y_{t-1}) = \int \tilde\phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})\, dx_t. \tag{12}
\]

Note that $\bar f_t(z_t | Y_{t-1})$ can be viewed as the pseudo-nominal density of $z_t$ conditioned on $Y_{t-1}$
computed from the conditional least favorable density $\tilde f_t(x_t | Y_{t-1})$ and the nominal transition
probability density $\phi_t(z_t | x_t)$. As in [9], we assume that

\[
\tilde f_t(x_t | Y_{t-1}) \sim \mathcal{N}(\hat x_t, V_t). \tag{13}
\]

In this way the conditional probability density $\bar f_t(z_t | Y_{t-1})$ is Gaussian. We make the additional
assumption that $\tilde\phi_t(z_t | x_t)$ is such that $\tilde f_t(z_t | Y_{t-1})$ is Gaussian. In [9], the latter assumption was
not made. However, it is worth noting that the least favorable solution found there is such that (12)
is Gaussian, see Remark 3.2. Therefore, we can measure the deviation between $\tilde\phi_t$ and $\phi_t$ as the
deviation between $\tilde f_t(z_t | Y_{t-1})$ and $\bar f_t(z_t | Y_{t-1})$ using the $\tau$-divergence. Then, we assume that
$\tilde\phi_t$ belongs to the closed ball about $\phi_t$: $\mathcal{B}_{t,\tau} = \{ \tilde\phi_t(z_t | x_t) \ \text{s.t.}\ D_\tau(\tilde f_t \| \bar f_t) \leq c_t \}$ where $c_t \in \mathbb{R}_+$
is the tolerance specified at each time step. Let $\mathcal{G}_t$ denote the class of estimators with finite
second-order moments with respect to all densities $\tilde\phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})$ such that $\tilde\phi_t(z_t | x_t) \in \mathcal{B}_{t,\tau}$.
Then the robust filter is characterized by the following minimax problem


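\[
(\tilde\phi_t^\circ, g_t^\circ) = \arg \min_{g_t \in \mathcal{G}_t} \max_{\tilde\phi_t \in \mathcal{B}_{t,\tau}} J_t(\tilde\phi_t, g_t) \tag{14}
\]

where

\[
J_t(\tilde\phi_t, g_t) = E_{\tilde f_t}[\|x_{t+1} - g_t(y_t)\|^2 \,|\, Y_{t-1}] = \iint \|x_{t+1} - g_t(y_t)\|^2 \tilde\phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})\, dx_t\, dz_t
\]

denotes the mean square error of the estimator $\hat x_{t+1} = g_t(y_t)$ of $x_{t+1}$ evaluated with respect to
the transition density $\tilde\phi_t$ in $\mathcal{B}_{t,\tau}$. It is worth noting that $\hat x_{t+1}$ depends on $Y_t$, and not only on $y_t$,
but this dependency is suppressed to simplify the notation.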


Remark 3.1: In the minimax problem (14) we require that $\tilde f_t(z_t | Y_{t-1})$ defined in (12) is a
conditional probability density, that is

\[
\iint \tilde\phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})\, dz_t\, dx_t = 1, \tag{15}
\]

but we do not require that $\tilde\phi_t(z_t | x_t)$ is a transition probability density for each $x_t$. Therefore,
the a priori conditional probability density $\tilde f_t(x_t | Y_{t-1})$ is not required to coincide with the a
posteriori one computed from $\tilde\phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})$.


Remark 3.2: In [9], $\tilde\phi_t(z_t | x_t)$ is not required to be such that (12) is Gaussian. The constraint
on $\tilde\phi_t(z_t | x_t)$ is that $D_{KL}(\tilde f_t \| \bar f_t) \leq c_t$ where $D_{KL}$ is the Kullback-Leibler divergence among
probability densities. On the other hand, the solution $\tilde\phi_t^\circ(z_t | x_t)$ to the corresponding minimax
problem is such that (12) is Gaussian, see [9, Formula (16)]. Hence, the corresponding least favorable density
$\tilde f_t^\circ(z_t | Y_{t-1})$ is Gaussian. Note that $D_{KL}(\tilde f_t \| \bar f_t) = \frac{1}{2} D_0(\tilde f_t \| \bar f_t)$ when $\tilde f_t$, $\bar f_t$ are Gaussian. We conclude that,
for $\tau = 0$, the solution to (14) coincides with the one in [9].

IV. Robust Kalman Filters

We show that the optimal robust estimator solving the minimax problem (14) is a Kalman-like
filter parametrized by $\tau$. In this way, we obtain a parametric family of robust Kalman filters.
First, Problem (14) can be reformulated as the static minimax problem (3). Consider the ball
$\bar{\mathcal{B}}_{t,\tau} = \{ \tilde f_t(z_t | Y_{t-1}) \ \text{s.t.}\ D_\tau(\tilde f_t \| \bar f_t) \leq c_t \}$ which is the set of all probability densities having
structure (12) with $\tilde\phi_t \in \mathcal{B}_{t,\tau}$. The equivalent minimax problem is

\[
(\tilde f_t^\circ, g_t^\circ) = \arg \min_{g_t \in \mathcal{G}_t} \max_{\tilde f_t \in \bar{\mathcal{B}}_{t,\tau}} \bar J_t(\tilde f_t, g_t)
\]

where

\[
\bar J_t(\tilde f_t, g_t) = \int \|x_{t+1} - g_t(y_t)\|^2 \tilde f_t(z_t | Y_{t-1})\, dz_t.
\]

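In view of (10) and (13), the pseudo-nominal density is Gaussian

\[
\bar f_t(z_t | Y_{t-1}) \sim \mathcal{N}\left( \begin{bmatrix} A_t \\ C_t \end{bmatrix} \hat x_t,\ K_{z_t} \right) \tag{16}
\]

where the conditional covariance matrix $K_{z_t}$ is given by

\[
K_{z_t} = \begin{bmatrix} K_{x_{t+1}} & K_{x_{t+1}, y_t} \\ K_{y_t, x_{t+1}} & K_{y_t} \end{bmatrix}
= \begin{bmatrix} A_t \\ C_t \end{bmatrix} V_t \begin{bmatrix} A_t^T & C_t^T \end{bmatrix}
+ \begin{bmatrix} B_t \\ D_t \end{bmatrix} \begin{bmatrix} B_t^T & D_t^T \end{bmatrix}. \tag{17}
\]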


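Applying Theorem 2.1 with $f \leftarrow \bar f_t$, $\tilde f \leftarrow \tilde f_t$ and $g \leftarrow g_t$, the least favorable conditional
density $\tilde f_t^\circ(z_t | Y_{t-1})$ is such that

\[
\tilde f_t^\circ(z_t | Y_{t-1}) \sim \mathcal{N}\left( \begin{bmatrix} A_t \\ C_t \end{bmatrix} \hat x_t,\ \tilde K_{z_t}^\circ \right) \tag{18}
\]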


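where the least favorable conditional covariance matrix is

\[
\tilde K_{z_t}^\circ = \begin{bmatrix} \tilde K_{x_{t+1}} & K_{x_{t+1}, y_t} \\ K_{y_t, x_{t+1}} & K_{y_t} \end{bmatrix}.
\]

Let

\[
\begin{aligned}
P_{t+1} &= K_{x_{t+1}} - K_{x_{t+1}, y_t} K_{y_t}^{-1} K_{y_t, x_{t+1}} \\
V_{t+1} &= \tilde K_{x_{t+1}} - K_{x_{t+1}, y_t} K_{y_t}^{-1} K_{y_t, x_{t+1}}
\end{aligned}
\]

denote the nominal and the least favorable conditional covariance of $x_{t+1}$ given $Y_t$. Then,

\[
V_{t+1} =
\begin{cases}
L_{P_{t+1}} \left( I_n - \theta_t (1-\tau) L_{P_{t+1}}^T L_{P_{t+1}} \right)^{\frac{1}{\tau-1}} L_{P_{t+1}}^T, & 0 \leq \tau < 1 \\[4pt]
L_{P_{t+1}} \exp\left( \theta_t L_{P_{t+1}}^T L_{P_{t+1}} \right) L_{P_{t+1}}^T, & \tau = 1
\end{cases}
\]

where $L_{P_{t+1}}$ is such that $P_{t+1} = L_{P_{t+1}} L_{P_{t+1}}^T$ and $\theta_t$, with $\theta_t^{-1} > (1-\tau)\|P_{t+1}\|$, is the unique solution to
$c_t = \gamma_\tau(P_{t+1}, \theta_t)$ where $\gamma_\tau$ has been defined in (7). The optimal robust estimator takes the form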

\[
\hat x_{t+1} = g_t^\circ(y_t) = A_t \hat x_t + G_t (y_t - C_t \hat x_t) \tag{19}
\]

with gain matrix $G_t = K_{x_{t+1}, y_t} K_{y_t}^{-1}$. From (17), we obtain

\[
\begin{aligned}
G_t &= (A_t V_t C_t^T + B_t D_t^T)(C_t V_t C_t^T + D_t D_t^T)^{-1} \\
P_{t+1} &= A_t V_t A_t^T - G_t (C_t V_t C_t^T + D_t D_t^T) G_t^T + B_t B_t^T.
\end{aligned}
\tag{20}
\]

Algorithm 1 shows the iterative scheme of the optimal robust estimator we found for the
case $0 \leq \tau < 1$. The algorithm for the limit case $\tau = 1$ is the same, except that the
update of $V_{t+1}$ is different. It is clear that the robust filter has the same iterative structure as the
Kalman filter, except that $P_t$ is distorted into the matrix $V_t$. In particular,
$G_t$ is governed by a Riccati-like equation.


Algorithm 1: Robust Kalman filter at time $t$

Input : $c_t$, $\hat x_t$, $V_t$, $y_t$

Output: $\hat x_{t+1}$, $V_{t+1}$

1 $G_t = (A_t V_t C_t^T + B_t D_t^T)(C_t V_t C_t^T + D_t D_t^T)^{-1}$

2 $\hat x_{t+1} = A_t \hat x_t + G_t (y_t - C_t \hat x_t)$

3 $P_{t+1} = A_t V_t A_t^T - G_t (C_t V_t C_t^T + D_t D_t^T) G_t^T + B_t B_t^T$

4 Find $\theta_t$ such that $c_t = \gamma_\tau(P_{t+1}, \theta_t)$

5 Compute $V_{t+1} = L_{P_{t+1}} \left( I_n - \theta_t (1-\tau) L_{P_{t+1}}^T L_{P_{t+1}} \right)^{\frac{1}{\tau-1}} L_{P_{t+1}}^T$
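
The five steps of Algorithm 1 can be sketched in NumPy as follows. This is an illustrative sketch, not the author's implementation: the function names are hypothetical, Step 4 is solved by bisection (our choice, the paper does not specify a method), and Step 5 is evaluated through the eigendecomposition of $P_{t+1}$, which gives the same $V_{t+1}$ as any factor $L_{P_{t+1}}$. The case $\tau = 1$ uses the exponential update instead.

```python
import numpy as np


def gamma_eig(w, theta, tau):
    """gamma_tau(P, theta) of (7) given the eigenvalues w of P."""
    if tau == 1:
        return np.sum(np.exp(theta * w) * (theta * w - 1.0) + 1.0)
    s = 1.0 - theta * (1.0 - tau) * w
    if tau == 0:
        return np.sum(np.log(s) + 1.0 / s - 1.0)
    return np.sum(-s ** (tau / (tau - 1.0)) / (tau * (1.0 - tau))
                  + s ** (1.0 / (tau - 1.0)) / (1.0 - tau) + 1.0 / tau)


def robust_kf_step(x_hat, V, y, A, B, C, D, c, tau, iters=200):
    """One pass of Algorithm 1; returns (x_next, V_next, P_next, theta, G)."""
    S = C @ V @ C.T + D @ D.T
    G = (A @ V @ C.T + B @ D.T) @ np.linalg.inv(S)            # step 1
    x_next = A @ x_hat + G @ (y - C @ x_hat)                  # step 2
    P_next = A @ V @ A.T - G @ S @ G.T + B @ B.T              # step 3
    w, W = np.linalg.eigh((P_next + P_next.T) / 2.0)
    lo = 0.0
    hi = 1.0 / ((1.0 - tau) * w.max()) if tau < 1 else 1.0
    while tau == 1 and gamma_eig(w, hi, tau) < c:             # bracket the root
        hi *= 2.0
    for _ in range(iters):                                    # step 4 (bisection)
        mid = 0.5 * (lo + hi)
        if gamma_eig(w, mid, tau) < c:
            lo = mid
        else:
            hi = mid
    theta = lo
    if tau == 1:
        g = np.exp(theta * w)
    else:
        g = (1.0 - theta * (1.0 - tau) * w) ** (1.0 / (tau - 1.0))
    V_next = (W * (w * g)) @ W.T                              # step 5
    return x_next, V_next, P_next, theta, G
```

Calling `robust_kf_step` repeatedly, feeding back `x_next` and `V_next`, reproduces the recursion; replacing `V_next` by `P_next` would give back the standard Kalman filter.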


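It remains to characterize the least favorable transition density $\tilde\phi_t^\circ(z_t | x_t)$. It is not difficult to
prove that, [19, Theorem 2.1],

\[
K_{z_t}^{-1} = \begin{bmatrix} I_n & 0 \\ -G_t^T & I_p \end{bmatrix}
\begin{bmatrix} P_{t+1}^{-1} & 0 \\ 0 & K_{y_t}^{-1} \end{bmatrix}
\begin{bmatrix} I_n & -G_t \\ 0 & I_p \end{bmatrix}
\]

\[
(\tilde K_{z_t}^\circ)^{-1} = \begin{bmatrix} I_n & 0 \\ -G_t^T & I_p \end{bmatrix}
\begin{bmatrix} V_{t+1}^{-1} & 0 \\ 0 & K_{y_t}^{-1} \end{bmatrix}
\begin{bmatrix} I_n & -G_t \\ 0 & I_p \end{bmatrix}.
\]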


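Accordingly

\[
K_{z_t}^{-1} - (\tilde K_{z_t}^\circ)^{-1} = \begin{bmatrix} I_n \\ -G_t^T \end{bmatrix} \Phi_t \begin{bmatrix} I_n & -G_t \end{bmatrix}
\]

where $\Phi_t = P_{t+1}^{-1} - V_{t+1}^{-1}$, which is positive definite. Let $e_t = x_t - \hat x_t$ denote the estimation error.
Define

\[
\bar m_{z_t} = E_{\bar f_t}[z_t | Y_{t-1}] = E_{\tilde f_t^\circ}[z_t | Y_{t-1}] = \begin{bmatrix} A_t^T & C_t^T \end{bmatrix}^T \hat x_t.
\]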

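Therefore,

\[
\begin{aligned}
(z_t - \bar m_{z_t})^T \left( K_{z_t}^{-1} - (\tilde K_{z_t}^\circ)^{-1} \right) (z_t - \bar m_{z_t})
&= (z_t - \bar m_{z_t})^T \begin{bmatrix} I_n \\ -G_t^T \end{bmatrix} \Phi_t \begin{bmatrix} I_n & -G_t \end{bmatrix} (z_t - \bar m_{z_t}) \\
&= \|x_{t+1} - (A_t \hat x_t + G_t (y_t - C_t \hat x_t))\|^2_{\Phi_t} \\
&= \|x_{t+1} - \hat x_{t+1}\|^2_{\Phi_t} = \|e_{t+1}\|^2_{\Phi_t}.
\end{aligned}
\]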


By (18) and (16), we have

\[
\tilde f_t^\circ(z_t | Y_{t-1}) \propto \exp\left( \frac{1}{2} \|e_{t+1}\|^2_{\Phi_t} \right) \bar f_t(z_t | Y_{t-1}).
\]

By (11), we obtain

\[
\tilde f_t^\circ(z_t | Y_{t-1}) \propto \int \exp\left( \frac{1}{2} \|e_{t+1}\|^2_{\Phi_t} \right) \phi_t(z_t | x_t) \tilde f_t(x_t | Y_{t-1})\, dx_t
\]

and by (12) we conclude that

\[
\tilde\phi_t^\circ(z_t | x_t) = \frac{1}{M_t(\Phi_t)} \exp\left( \frac{1}{2} \|e_{t+1}\|^2_{\Phi_t} \right) \phi_t(z_t | x_t) \tag{21}
\]

where the normalizing constant $M_t(\Phi_t)$ is such that (15) holds. It is worth noting that in the
case $\tau = 0$, i.e. the case considered in [9], the distortion is a radial function of the estimation
error $e_{t+1}$, because $\Phi_t = \theta_t I_n$ for $\tau = 0$. On the contrary, in the case $\tau \neq 0$ such distortion is
nonradial.

V. Least-Favorable Model


For simulation and performance evaluation purposes, in particular for choosing the parameters $c_t$
and $\tau$, it is important to characterize the least favorable model which is the solution to (14). The
idea is to characterize it through (21). Note that there is a one-to-one correspondence between
$z_t$ and $v_t$, given $x_t$, through the relation

\[
z_t = \begin{bmatrix} A_t \\ C_t \end{bmatrix} x_t + \Gamma_t v_t
\]

because the matrix $\Gamma_t = [\, B_t^T \;\; D_t^T \,]^T$ is invertible. Accordingly, we can characterize the least favorable model with
model (9) where the distortion has now been moved into the noise $v_t$. Applying the same arguments
used in [9], see also [6, Section 17.7], it is not difficult to prove that the least favorable probability
density of $v_t$ depends on $e_t$ and is distributed as $\tilde\psi_t(v_t | e_t) \sim \mathcal{N}(H_t e_t, \tilde K_{v_t})$ where
$\tilde K_{v_t} = \left( I_{n+p} - (B_t - G_t D_t)^T (\Omega_{t+1}^{-1} + \Phi_t)(B_t - G_t D_t) \right)^{-1}$ and
$H_t = \tilde K_{v_t} (B_t - G_t D_t)^T (\Omega_{t+1}^{-1} + \Phi_t)(A_t - G_t C_t)$.

The matrix $\Omega_t^{-1}$ is computed from the backward recursion

\[
\Omega_t^{-1} = (A_t - G_t C_t)^T (\Omega_{t+1}^{-1} + \Phi_t)(A_t - G_t C_t) + H_t^T \tilde K_{v_t}^{-1} H_t \tag{22}
\]

where the final point can be initialized with $\Omega_{T+1}^{-1} = 0$ and $T$ is the simulation horizon. The
backward recursion is due to the fact that, integrating $\tilde\phi_t^\circ(z_t | x_t)$ over $z_t$, we obtain a positive
function of $e_t$; therefore “nature” has the opportunity to retroactively change the least
favorable density of $x_t$. It is not difficult to see that the least favorable model admits a state
space representation with matrices

\[
\tilde A_t = \begin{bmatrix} A_t & B_t H_t \\ 0 & A_t - G_t C_t + (B_t - G_t D_t) H_t \end{bmatrix}, \quad
\tilde B_t = \begin{bmatrix} B_t \\ B_t - G_t D_t \end{bmatrix} \tilde L_t, \quad
\tilde C_t = \begin{bmatrix} C_t & D_t H_t \end{bmatrix}, \quad
\tilde D_t = D_t \tilde L_t
\tag{23}
\]

where $\tilde L_t$ is such that $\tilde K_{v_t} = \tilde L_t \tilde L_t^T$, and the input is WGN with unit variance. Note that, to construct the least favorable model,
first we generate the gains $G_t$ by performing a forward sweep of the robust filter (19)-(20) over the
interval $[0, T]$, then we generate the matrices $\Omega_t^{-1}$ through a backward sweep over the interval $[0, T]$.
Therefore, increasing the simulation interval beyond $[0, T]$ requires performing a new backward
sweep of recursion (22). Then, we can evaluate the performance of an arbitrary estimator


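\[
\hat x'_{t+1} = A_t \hat x'_t + G'_t (y_t - C_t \hat x'_t) \tag{24}
\]

applied to the least favorable model. Let

\[
\Pi_t = E\left[ \begin{bmatrix} e'_t \\ e_t \end{bmatrix} \begin{bmatrix} (e'_t)^T & e_t^T \end{bmatrix} \right]
\]

where $e_t$ is the estimation error of the optimal filter (19) and $e'_t$ is the estimation error of filter
(24). Then, it can be proven that $\Pi_t$ obeys the Lyapunov equation

\[
\Pi_{t+1} = \left( \tilde A_t - \begin{bmatrix} G'_t \\ 0 \end{bmatrix} \tilde C_t \right) \Pi_t \left( \tilde A_t - \begin{bmatrix} G'_t \\ 0 \end{bmatrix} \tilde C_t \right)^T
+ \left( \tilde B_t - \begin{bmatrix} G'_t \\ 0 \end{bmatrix} \tilde D_t \right) \left( \tilde B_t - \begin{bmatrix} G'_t \\ 0 \end{bmatrix} \tilde D_t \right)^T
\tag{25}
\]

where $\Pi_0 = I_2 \otimes V_0$.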
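
The backward sweep of (22) and the construction of the matrices in (23) can be sketched as below for a time-invariant nominal model. This is a sketch under stated assumptions: the function name `least_favorable_model` is hypothetical, the gains $G_t$ and the matrices $\Phi_t = P_{t+1}^{-1} - V_{t+1}^{-1}$ are assumed to be available from a forward sweep of the robust filter, and $\tilde L_t$ is taken as the Cholesky factor of $\tilde K_{v_t}$.

```python
import numpy as np


def least_favorable_model(A, B, C, D, gains, Phis):
    """Backward sweep of (22) and matrices of (23), time-invariant nominal model.

    gains[t] = G_t and Phis[t] = Phi_t for t = 0..T are assumed to come from a
    forward sweep of the robust filter. Returns a list of (A~, B~, C~, D~).
    """
    n, m = A.shape[0], B.shape[1]
    Omega_inv = np.zeros((n, n))                       # Omega_{T+1}^{-1} = 0
    models = [None] * len(gains)
    for t in reversed(range(len(gains))):
        G, W = gains[t], Omega_inv + Phis[t]
        Ac, Bc = A - G @ C, B - G @ D
        Kv_inv = np.eye(m) - Bc.T @ W @ Bc             # inverse of K~_{v_t}
        Kv = np.linalg.inv(Kv_inv)
        H = Kv @ Bc.T @ W @ Ac
        Omega_inv = Ac.T @ W @ Ac + H.T @ Kv_inv @ H   # recursion (22)
        L = np.linalg.cholesky((Kv + Kv.T) / 2.0)      # K~_{v_t} = L L^T
        A_lf = np.block([[A, B @ H], [np.zeros((n, n)), Ac + Bc @ H]])
        B_lf = np.vstack([B, Bc]) @ L
        C_lf = np.hstack([C, D @ H])
        D_lf = D @ L
        models[t] = (A_lf, B_lf, C_lf, D_lf)
    return models
```

When all $\Phi_t$ are zero (no distortion), the sweep returns $H_t = 0$ and $\tilde K_{v_t} = I$, so the least favorable model reduces to the nominal one augmented with the error dynamics.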


VI. Risk Sensitive Filtering 


Consider the robust Kalman filter we presented in Section IV with $\tau = 0$. If we replace $\theta_t$
with a constant value $\theta$, we immediately recognize that we obtain the risk sensitive filter,
[?]. This suggests that the risk sensitive filter can be extended using the $\tau$-divergence family.
Consider the Gauss-Markov state space model (9). Let $\bar f_t(z_t | Y_{t-1})$ be the conditional density of
$z_t$ given $Y_{t-1}$ based on the model (9) and defined in (16). The classic risk sensitive estimator $g_t^\circ$
at time $t$ is defined as

\[
g_t^\circ = \arg \min_{g_t \in \mathcal{G}_t} E_{\bar f_t}\left[ \exp\left( \theta \|x_{t+1} - g_t(y_t)\|^2 \right) \,|\, Y_{t-1} \right] \tag{26}
\]

where $\mathcal{G}_t$ is the set of estimators for which the objective function in (26) is finite. $\theta > 0$ is the
risk sensitivity parameter. More precisely, the larger $\theta$ is, the more the objective function in (26)
penalizes estimators with large errors. In [2], it has been shown that the risk sensitive estimator
is the solution to the following minimax problem

\[
g_t^\circ = \arg \min_{g_t \in \mathcal{G}_t} \max_{\tilde f_t \in \tilde{\mathcal{B}}_t} E_{\tilde f_t}\left[ \|x_{t+1} - g_t(y_t)\|^2 \,|\, Y_{t-1} \right] - \theta^{-1} D_{KL}(\tilde f_t \| \bar f_t) \tag{27}
\]


where $\tilde{\mathcal{B}}_t = \{ \tilde f_t \ \text{s.t.}\ D_{KL}(\tilde f_t \| \bar f_t) < \infty \}$. The second term in the objective function in (27)
is always nonpositive because $D_{KL}(\tilde f_t \| \bar f_t) \geq 0$. For small values of $\theta$, it takes large negative
values for conditional densities not close to the nominal one. Therefore, the maximizer is obliged
to choose a conditional density close to the nominal one. On the contrary, for large values of $\theta$,
it takes (negative) values close to zero for some conditional densities not close to the nominal
one. In such a situation, the maximizer has the possibility to choose those conditional densities.
Note that this behaviour does not change if we replace $D_{KL}$ with another divergence measure.

In our setting $\bar f_t(z_t | Y_{t-1})$ is Gaussian by assumption. In addition, if we assume that $\tilde f_t(z_t | Y_{t-1})$
is Gaussian, then $D_{KL}(\tilde f_t \| \bar f_t) = \frac{1}{2} D_0(\tilde f_t \| \bar f_t)$ where $D_0$ has been defined in (1). It is then natural
to extend the minimax problem (27) to the $\tau$-divergence family:

\[
g_t^\circ = \arg \min_{g_t \in \mathcal{G}_t} \max_{\tilde f_t \in \tilde{\mathcal{B}}_t} E_{\tilde f_t}\left[ \|x_{t+1} - g_t(y_t)\|^2 \,|\, Y_{t-1} \right] - \theta^{-1} D_\tau(\tilde f_t \| \bar f_t).
\]

By applying Corollary 2.1, the optimal $\tau$-risk sensitive estimator takes the form of (19)-(20)
where

\[
V_{t+1} =
\begin{cases}
L_{P_{t+1}} \left( I_n - \theta(1-\tau) L_{P_{t+1}}^T L_{P_{t+1}} \right)^{\frac{1}{\tau-1}} L_{P_{t+1}}^T, & 0 \leq \tau < 1 \\[4pt]
L_{P_{t+1}} \exp\left( \theta L_{P_{t+1}}^T L_{P_{t+1}} \right) L_{P_{t+1}}^T, & \tau = 1.
\end{cases}
\tag{28}
\]

It is worth noting that, for the case $0 \leq \tau < 1$, $V_{t+1}$ is defined provided that $0 < P_{t+1} <
(\theta(1-\tau))^{-1} I_n$, while for the case $\tau = 1$, it is well defined whenever $P_{t+1}$ is positive definite.
The algorithmic scheme is similar to Algorithm 1; the only difference is that Step 4 is now
removed. Finally, while the risk sensitivity parameter of the robust filter of Section IV is time
varying, and its evolution is governed by $c_t$, now it is constant.
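
The update (28), together with the condition under which it is defined, can be sketched as follows. The function name `risk_sensitive_update` is hypothetical; the sketch assumes $\theta > 0$ and evaluates (28) through the eigendecomposition of $P_{t+1}$, raising an error when $P_{t+1} < (\theta(1-\tau))^{-1} I_n$ is violated.

```python
import numpy as np


def risk_sensitive_update(P, theta, tau):
    """V_{t+1} of (28) for a constant theta > 0; raises if (28) is not defined."""
    w, W = np.linalg.eigh((P + P.T) / 2.0)
    if w.min() <= 0:
        raise ValueError("P must be positive definite")
    if tau == 1:
        g = np.exp(theta * w)
    else:
        if w.max() >= 1.0 / (theta * (1.0 - tau)):
            raise ValueError("need P < (theta (1 - tau))^{-1} I")
        g = (1.0 - theta * (1.0 - tau) * w) ** (1.0 / (tau - 1.0))
    return (W * (w * g)) @ W.T
```

Used in place of Steps 4 and 5 of Algorithm 1, this gives one step of the $\tau$-risk sensitive filter; for $\tau = 0$ it reduces to $(P_{t+1}^{-1} - \theta I_n)^{-1}$.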


VII. Simulation Results 
We consider the time-invariant model (9) with

\[
A = \begin{bmatrix} 0.1 & 1 \\ 0 & 1.2 \end{bmatrix}, \quad
B = \begin{bmatrix} 0.01 & 0 & 0 \\ 0 & 0.01 & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & -1 \end{bmatrix}, \quad
D = \begin{bmatrix} 0 & 0 & 0.1 \end{bmatrix}
\]

and $x_0 \sim \mathcal{N}(0, V_0)$ with $V_0 = 0.01 \cdot I_2$. We consider the following three filters: KF is the standard
Kalman filter; RKF$_0$ is the robust Kalman filter of Section IV with $\tau = 0$; RKF$_1$ is the robust
Kalman filter of Section IV with $\tau = 1$. In Figure 1 we show the evolution of the risk sensitivity
parameter of RKF$_0$ and RKF$_1$ for $c = 10^{-1}$. We notice that $\theta_t$ is constant in the steady state, that
is, RKF$_0$ and RKF$_1$ coincide with the risk sensitive filters of Section VI with $\tau = 0$, $\theta \approx 0.19$
and $\tau = 1$, $\theta \approx 0.23$, respectively, in the steady state.



Fig. 1. Evolution of the risk sensitivity parameter $\theta_t$ with $c = 10^{-1}$.


In what follows we evaluate the performance of KF, RKF$_0$ and RKF$_1$, which have structure
(24), applied to the least favorable model (23). More precisely, for each filter applied to (23),
we consider the estimation error $e_t = [\, e_t^1 \;\; e_t^2 \,]^T$. Then, we compute the variance of $e_t^1$ and $e_t^2$
through (25). We consider two situations: $c$ large, i.e. the nominal and the least favorable model are
very different; $c$ small, i.e. the nominal and the least favorable model are similar.


A. Large tolerance 

Here RKF$_0$ and RKF$_1$ have tolerance $c = 10^{-1}$. In the first experiment, we apply these filters
to the nominal model (9). The variances of $e_t^1$ and $e_t^2$ are depicted in the first row of Figure 2. As
expected, KF performs better than the others. Moreover, the variances of RKF$_0$ are slightly larger
than the ones of RKF$_1$. In the second experiment, we apply these filters to the least favorable
model (23) with $\tau = 0$ and $c = 10^{-1}$. The variances of $e_t^1$ and $e_t^2$ are depicted in the second row
of Figure 2. As expected, RKF$_0$ is the best estimator because it has been designed with respect to
this model. Although RKF$_1$ has been designed with respect to another model, it performs better
than KF. In the third experiment, we apply these filters to the least favorable model (23) with
$\tau = 1$ and $c = 10^{-1}$. The variances of $e_t^1$ and $e_t^2$ are depicted in the third row of Figure 2. In
this case RKF$_1$ is the best estimator because it is optimal with respect to the underlying model.
Also in this case, the worst estimator is KF. From these experiments we can conclude that:

• the smaller $\tau$ is, the more conservative the filter is, that is, the smaller $\tau$ is, the more the
uncertainty class contains models with larger mean square error. This property has also been
noticed for the static estimation problem in [19].

• the family of robust Kalman filters provides better performance than the standard Kalman
filter, even in the case that the least favorable model belongs to an uncertainty class
parametrized by a different $\tau$.

Fig. 2. Variances of $e_t^1$ and $e_t^2$ when the filters are applied to the nominal model (first row); to (23) with $\tau = 0$ and $c = 10^{-1}$
(second row); to (23) with $\tau = 1$ and $c = 10^{-1}$ (third row). Here RKF$_0$ and RKF$_1$ have $c = 10^{-1}$.


B. Small tolerance 

We perform the same three experiments as before, where the only difference
is the tolerance, which is now $c = 5 \cdot 10^{-3}$, see Figure 3. RKF$_0$ and RKF$_1$ provide the same
performance, which is comparable with the one of KF. Therefore, as long as the discrepancy
between the nominal and the least favorable model is not too large, the performance of KF
applied to (23) does not deteriorate too much.


VIII. Conclusions 

In this paper, we have considered a robust filtering problem under incremental model perturbations
characterized by the $\tau$-divergence family. The family of robust estimators we proposed
is the solution to a minimax problem. These robust estimators have an iterative structure similar
to the one of the Kalman filter. We have derived the corresponding least favorable models.
Moreover, we have extended the risk sensitive filter to a family of risk sensitive like filters.
Finally, a simulation study shows that the parameter $\tau$ tunes how conservative the robust filter is.

Fig. 3. Variances of $e_t^1$ and $e_t^2$ when the filters are applied to the nominal model (first row); to (23) with $\tau = 0$ and $c = 5 \cdot 10^{-3}$
(second row); to (23) with $\tau = 1$ and $c = 5 \cdot 10^{-3}$ (third row). Here RKF$_0$ and RKF$_1$ have $c = 5 \cdot 10^{-3}$.